New developments in automatic meeting transcription
نویسندگان
چکیده
In this paper we report on new developments in the automatic meeting transcription task. Unlike other types of speech (such as those found in Broadcast News and Switchboard), meetings are unique in their richer dynamics of human-to-human interaction. An intuitive “fingernail” plot is proposed to visualize such turntaking behavior. We will also show how recognition of short turns can be improved by building a language model tailored specifically for short turns. Out-Of-Vocabulary (OOV) words become a more salient problem in the meeting transcription task, as they are mostly topic words and proper names, lack of which not only causes Word Error Rate (WER) increase, but also limits further use of recognition hypotheses. We describe a prototype system which uses the Web as a source for vocabulary expansion, and present preliminary OOV retrieval results.
منابع مشابه
Progress in automatic meeting transcription
In this paper we report recent developments on the meeting transcription task, a large vocabulary conversational speech recognition task. Previous experiments showed this is a very challenging task, with about 50% word error rate (WER) using existing recognizers. The difficulty mostly comes from highly disfluent/conversational nature of meetings, and lack of domain specific training data. For t...
متن کاملTowards Computer Understanding of Human Interactions
People meet in order to interact disseminating information, making decisions, and creating new ideas. Automatic analysis of meetings is therefore important from two points of view: extracting the information they contain, and understanding human interaction processes. Based on this view, this article presents an approach in which relevant information content of a meeting is identified from a va...
متن کاملThe AMI Meeting Transcription System: Progress and Performance
We present the AMI 2006 system for the transcription of speech in meetings. The system was jointly developed by multiple sites on the basis of the 2005 system for participation in the NIST RT'05 evaluations. The paper describes major developments such as improvements in automatic segmentation, cross-domain model adaptation, inclusion of MLP based features, improvements in decoding, language mod...
متن کاملA CAD System Framework for the Automatic Diagnosis and Annotation of Histological and Bone Marrow Images
Due to ever increasing of medical images data in the world’s medical centers and recent developments in hardware and technology of medical imaging, necessity of medical data software analysis is needed. Equipping medical science with intelligent tools in diagnosis and treatment of illnesses has resulted in reduction of physicians’ errors and physical and financial damages. In this article we pr...
متن کاملOverview of Automatic Speech Recognition for Transcription System in the Japanese Parliament (Diet)
This article describes a new automatic transcription system in the Japanese Parliament which deploys our automatic speech recognition (ASR) technology and has been in official operation since April 2011. The speaker-independent ASR system handles all plenary sessions and committee meetings to generate an initial draft, which is corrected by Parliamentary reporters. To achieve high recognition p...
متن کامل